AFIT/GOR/ENS/96M-13


STATISTICAL  PROCESS  CONTROL  AND 
MEDICAL  SURVEILLANCE 
An  Application  With  Liver  Function  Tests 

THESIS 


Bryan  D.  Richardson,  Second  Lieutenant,  USAF 


Approved  for  public  release;  distribution  unlimited 




The  views  expressed  in  this  thesis  are  those  of  the  author 
and  do  not  reflect  the  official  policy  or  position  of  the 
Department  of  Defense  or  the  U.S.  Government. 




Presented  to  the  Faculty  of  the  School  of  Engineering 
of  the  Air  Force  Institute  of  Technology 
Air  University 
In  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of 
Master  of  Science  in  Operations  Research 


Bryan  D.  Richardson,  B.S. 
Second  Lieutenant,  USAF 
March  1996 




Thesis Approval


STUDENT: 2Lt Bryan D. Richardson          CLASS: GOR96-M

THESIS TITLE: STATISTICAL PROCESS CONTROL AND MEDICAL SURVEILLANCE
              An Application With Liver Function Tests

DEFENSE DATE: 20 February 1996

COMMITTEE:    NAME/DEPARTMENT                SIGNATURE

Advisor       Lt Col Kenneth Bauer/ENS

Reader        Lt Col Jack Jackson/ENS

Reader        Lt Col Dave Louis/SGPO


Acknowledgments 


I  owe  a  great  deal  of  thanks  to  many  people.  Although  I  cannot  begin  to  name 
everyone  who  has  helped  me  in  some  way,  I  would  be  gravely  mistaken  if  I  were  to  leave 
out  my  mentors,  LTC  Ken  Bauer  and  LTC  Jack  Jackson;  my  personal  medical  consultant, 
LTC  Dave  Louis;  and  my  life-saver,  CPT  Paul  McAree,  the  man  who  can  “SASify” 
anything. Thank you all so much!


Bryan  D.  Richardson 




Table  of  Contents 


Acknowledgments

List of Figures

List of Tables

Abstract

I. Introduction
    Background
    Statement of the Problem
    Research Objective

II. Medical Literature Review
    Medical Surveillance for Occupational Hepatotoxins
    Assessing Test Validity
    Screening Enzyme Tests
    Limitations of Detecting Occupational Liver Disease
    Normal Values

III. Methodology
    Methodology Overview
    Data Collection
    Data Conversion
    Classification of Work Zone Exposure Areas
    Dimensionality Reduction
    Control Charts

IV. Results
    Overview
    Liver Function Test Distributions
    Upper Control Limits
    Multivariate Analysis
    Control Charts
    Summary

V. Final Remarks and Follow-On Work
    Final Remarks
    Follow-On Work

Appendix A: Screening Enzyme Tests

Appendix B: SAS Programs
    Appendix B.1: CONVERT.SAS
    Appendix B.2: SGPTRAW.SAS
    Appendix B.3: MERGEALL.SAS

Bibliography

Vita


List  of  Figures 


Figure 1-1. Phases of liver disease development

Figure 2-1. Illustration of accuracy measures and characteristics for liver tests

Figure 3-1. Database and analysis development

Figure 4-1. Distributions of liver function tests

Figure 4-2. Plot of factor 1 and factor 2 scores

Figure 4-3. Plot of factor 1 and factor 2 scores using five variables

Figure 4-4. Plot of factor 1 and ALT

Figure 4-5. 1991 standardized control chart for Transferase Index inputs

Figure 4-6. 1991 demerit control chart


List  of  Tables 


Table 1-1. Variables collected for the study

Table 1-2. Summary of data for each liver test

Table 2-1. Tests for evaluation of liver disease

Table 2-2. Chemical agents associated with occupational liver disease

Table 2-3. Normal values for liver function tests

Table 4-1. Liver function test applied to normal distributions

Table 4-2. 90th through 99th percentiles of workers

Table 4-3. Wilk-Shapiro statistics

Table 4-4. Correlation matrix

Table 4-5. Eigenvalues of the correlation matrix

Table 4-6. Matrix of eigenvectors

Table 4-7. Factor pattern

Table 4-8. Orthogonal transformation matrix for varimax rotation

Table 4-9. Factor pattern after varimax rotation

Table 4-10. Standardized scoring coefficients

Table 4-11. Eigenvalues of the correlation matrix with five input variables

Table 4-12. Factor patterns before and after rotation for five variable case

Table 4-13. Orthogonal transformation matrix for five variable case

Table 4-14. Standardized scoring coefficients for five variable case

Table 4-15. Eigenvalues of the correlation matrix using ALT, AST, and GGT

Table 4-16. Results from factor analysis on ALT, AST, and GGT

Table 4-17. Standardized scores of zones above upper control limits based on abnormal ALT, AST, or GGT tests

Table 4-18. Standardized scores of zones above upper control limits based on abnormal ALT criteria

Table 4-19. Comparison of ALT, AST, GGT criterion and ALT criterion

Table 4-20. Summary of demerit control charts

Table 4-21. Summary of liver disease “hot-spots.”




Abstract 

Medical surveillance of liver disease traditionally involves a battery of tests. This
research used multivariate analysis techniques to reduce the number of measures
required to identify liver dysfunction. A Transferase Index (a combination of three
tests: ALT, AST, and GGT) provided the most satisfying assessment, but the single best
indicator, ALT, may be sufficient. The Transferase Index and ALT criteria were both
applied to control charts. Through the use of statistical process control (SPC), this
research identified work zones showing signs of adverse effects on an individual’s liver
as a possible result of the work environment and demonstrated SPC as an excellent way
to conduct medical surveillance. Industry has embraced SPC and control charts; this
research extended their scope and demonstrated their effective use in medical
surveillance of the liver, showing that they provide easy, efficient ways to monitor
work environments.
STATISTICAL PROCESS CONTROL AND MEDICAL SURVEILLANCE


AN APPLICATION WITH LIVER FUNCTION TESTS


I. INTRODUCTION


Background 

Of all the organs in our bodies, the liver is one of the most susceptible to injury
from drugs and environmental toxins (Douidar, 1992; 109). The liver plays a central role
in the detoxification and elimination of the foreign compounds, known as xenobiotics,
that we encounter every day. Some of these xenobiotics enter our bodies intentionally
through inhalation, ingestion, and skin absorption (for example, through alcohol
consumption and smoking), while others enter without our awareness. By virtue of its
role in the metabolism of xenobiotics, the liver is especially vulnerable to chemical
injury and is thus of central clinical interest (Harrison, 1990a; 247).

The Medical Group’s Occupational Medicine Element (74th SGPO) at Wright-Patterson
Air Force Base (WPAFB) is one organization with a keen interest in xenobiotic
exposures. The mission of the 74th SGPO is to optimize worker health for all civilian and
military employees at WPAFB, which it achieves by monitoring the working environment.
Of all occupationally related diseases, damage to the liver is the second most common,
after lung disease (Harrison, 1990a; 247).

To facilitate monitoring WPAFB personnel, the 74th SGPO maintains a health
database, called the PHOENIX system, which contains information dating back to 1989.
For each individual monitored, there is a record of their work areas, which includes zones
(areas of common exposures within a specific building or organization) and the dates of
service in each zone. There is also information on their personal health history, family
health history, liver function test results, and personal habits which may contribute to
liver disease, such as alcohol consumption. The PHOENIX system is able to monitor
personnel, but it is not useful in an analytic sense. However, through the use of various
software packages, the data in PHOENIX can be extracted and analyzed to provide the
74th SGPO with answers to questions related to occupational liver disease among
different work zones.

Employees in particular work zones are logical targets for the screening of
occupational disease for two reasons. They have at least some risk factors in common
(their workplace exposures), and they have a clear opportunity for prevention, reduction,
or elimination of those exposures (Levy, 1988; 75). Typical liver disease development is
shown in Figure 1-1.


Figure  1-1.  Phases  of  liver  disease  development  (Levy,  1988;  77). 


The data extracted from the database for use in this study include an identifier
(social security number), applicable personal history variables, and liver related
data elements, including liver function tests (see Table 1-1). Most individuals have
multiple liver function test results since surveillance began in 1989. For example,
someone may have four different results for ALT, a liver function test, each one recorded
in a separate year. At most, an individual may have seven test result observations. A
summary of data collected for this study is given in Table 1-2.


Data Contained in Variable                        Variable Used

Social Security Number                            SSAN
gender                                            SEX
work zone when a test was administered            ZONE
date of test                                      DATE
history of blood disease (0-1 variable)           BLDDIS
history of liver disease (0-1 variable)           LIVERBAD
history of hepatitis (0-1 variable)               HEP
history of jaundice (0-1 variable)                JAUNDICE
ounces of liquor consumed per week                D1
bottles of beer consumed per week                 D2
glasses of wine consumed per week                 D3
serum glutamic-pyruvic transaminase or            SGPT or ALT
  alanine aminotransferase - a liver test
serum glutamic-oxaloacetic transaminase or        SGOT or AST
  aspartate aminotransferase - a liver test
γ-glutamyl transferase - a liver test             GGT
bilirubin - a liver test                          BILIRUBIN
albumin - a liver test                            ALBUMIN
alkaline phosphatase - a liver test               AP
white blood count                                 WBC
level of hematocrit - percent of blood volume     HEMATOCRIT
  occupied by cells

Table 1-1. Variables collected for the study.

Statement of the Problem


Using the information extracted from the Occupational Health PHOENIX
Database, this study analyzes the data for any signs of possible adverse effects on an
individual’s liver that may be a result of the work environment.


Columns: Number of Observations, ALT, AST, GGT, Bilirubin, Albumin, AP

1: 453, 415, 423, 377, 368
2: 118, 104, 118
3: 88, 29, 61, 56, 61
4: 51, 60, 19, 32, 33
5: 54, 50, 2, 25, 34, 33
6: 32, 19, 5, 18, 23
7: 11, 4, 5, 6
TOTAL: 804, 754, 558, 624, 631, 665

Table 1-2. Summary of data for each liver test.


Research  Objective 

Results  from  liver  function  tests  are  analyzed  to  identify  any  trends  the  74th  SGPO 
should  take  action  on.  This  effort  is  intended  to  be  used  as  a  screening  tool  for  the  74th 
SGPO.  The  purpose  of  screening  is  early  identification  of  conditions  which  already  exist 
so  their  progression  can  be  slowed,  halted,  or  even  reversed.  Through  screening  and 
surveillance,  hepatotoxicity  can  be  minimized  and  hopefully  prevented  (Douidar,  1992; 
118).  Therefore,  screening  is  a  secondary  preventive  measure  (Levy,  1988;  75).  If  the 
results  identify  individuals  or  zones  with  abnormal  data,  it  is  the  responsibility  of  the  74th 
SGPO to determine if liver disease is occupationally related and to take the appropriate
corrective action.


II. MEDICAL LITERATURE REVIEW


Medical  Surveillance  for  Occupational  Hepatotoxins 

The  objective  of  medical  surveillance  in  a  workplace  is  to  identify  workers  with 
subclinical  diseases  so  that  preventive  and/or  therapeutic  interventions  can  be 
implemented.  Medical  surveillance  can  be  done  through  a  variety  of  screening  methods 
such  as  questionnaires  (which  seek  suggestive  symptoms  or  exposures),  clinical 
examinations  (physicals),  and  laboratory  tests.  In  order  to  be  used  efficiently,  the  methods 
must be simple, noninvasive, safe, rapid, inexpensive, and widely available for routine use
(Levy, 1988; 75, Harrison, 1990a; 255, and Harrison, 1990b; 516). The “gold standard”
for liver testing is the liver biopsy, in which a small piece of the liver is removed. This
procedure is the most accurate method, but it carries a risk of morbidity and is expensive,
making other alternatives desirable. The primary alternative is a variety of “liver function
tests,” serum measurements of liver enzymes, that are used to characterize liver health
(Neuschwander-Tetri, 1995; 49). Various enzymes, present in large concentrations in liver cells, are
released  into  the  blood  stream  when  the  liver  is  dysfunctional  (damaged  or  destroyed). 
Through  common  blood  tests,  the  levels  of  these  enzymes  are  measured  from  the  serum 
to  provide  biochemical  evidence  of  cell  death,  hepatic  synthesis,  and  the  efficiency  of 
common  liver  processes.  These  biochemical  tests  and  tests  of  synthetic  function  are 
common  for  routine  use.  Another  form  of  testing  is  clearance  tests.  Although  clearance 
tests  are  used  in  some  research  settings,  they  are  not  widely  available  and  not  suggested 
for  routine  use  (Harrison,  1990a;  255).  Tests  for  evaluation  of  liver  disease  can  be  found 
in  Table  2-1. 

Biochemical tests - levels of chemicals (enzymes)
    Serum enzyme activity
    Serum alkaline phosphatase
    Serum lactate dehydrogenase
    Serum bilirubin
    Urine bilirubin

Test of synthetic liver function - protein production
    Serum albumin
    Prothrombin time
    Alpha-fetoprotein
    Serum ferritin

Clearance tests - test of functional ability
    Exogenous clearance tests
        Sulfobromophthalein
        Indocyanine green
        Antipyrine test
        Aminopyrine breath test
        Caffeine breath test
    Endogenous clearance tests
        Serum bile acid


Table  2-1.  Tests  for  evaluation  of  liver  disease  (Harrison  1990a;  255). 


Assessing  Test  Validity 

The  ideal  screening  tests  for  liver  problems  should  correctly  identify  people  with  an 
abnormal  test  who  truly  have  occupation-associated  liver  disease.  The  common  way  of 
describing  a  test’s  characteristics  is  through  sensitivity  (how  sensitive  the  test  is  at 
detecting  disease)  and  specificity  (how  good  the  test  is  at  rejecting  samples  that  are  not 
diseased) (Streiner, 1989; 81). Sensitivity is a measure of the test’s ability to detect
people  with  disease  and  is  measured  by: 

\[ \text{Sensitivity} = \frac{\text{Number with disease who have a positive test}}{\text{Number with disease}} \]

Conversely,  specificity  measures  the  ability  of  the  test  to  correctly  identify  those  who  do 
not  have  disease  and  is  measured  as  follows: 

\[ \text{Specificity} = \frac{\text{Number without disease who have a negative test}}{\text{Number without disease}} \]

However, both of these measures require some knowledge of the true state of
affairs (in their denominators) since they are based on people who do or do not have
disease. Knowledge about the true state of the liver requires a liver biopsy, which is
undesirable due to the morbidity of the procedure. Another way to assess the accuracy of
the tests is to calculate the probability that someone who tests positive actually has
disease. Similarly, we can calculate the probability that someone who tests negative
does not have disease. These probabilities are called positive predictive value and
negative predictive value. Positive predictive value (PPV) is the ratio of people
with positive tests who actually have disease to all positive tests. Negative predictive
value (NPV) is the ratio of people with negative tests who do not have disease to all
negative tests. A high positive predictive value is desired in screening tests. For an
illustration of test measures, see Figure 2-1.

\[ \text{Positive Predictive Value} = \frac{\text{People with positive test and disease}}{\text{All people with positive test}} \]

\[ \text{Negative Predictive Value} = \frac{\text{People with negative test and no disease}}{\text{All people with negative test}} \]

The predictive value of a test depends upon its reliability (ability of the test to be
reproduced), validity (sensitivity and specificity), as well as the prevalence of dysfunction
(how common the disease is within the population sampled). When prevalence of liver
disease is low (rare within the population), the positive predictive value of the test is low
and negative predictive value is high. Conversely, if the prevalence is very high, the
negative predictive value is low, but the positive predictive value is high (Douidar, 1992;
120).
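
The dependence of predictive value on prevalence can be illustrated with a minimal Python sketch. It assumes a test with fixed sensitivity and specificity and applies Bayes' rule; the function name `predictive_values` and the numeric inputs are hypothetical illustrations, not values from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative inputs only: the same test applied at three prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these illustrative inputs, PPV rises and NPV falls as prevalence increases, which is the pattern described above.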


                                True State of Nature
                            Have Disease     No Disease
Test        Positive             P               E1
Result      Negative             E2              N

Sensitivity = P / (P + E2)     PPV = P / (P + E1)     E1 = False Positives (Type I Error)
Specificity = N / (N + E1)     NPV = N / (N + E2)     E2 = False Negatives (Type II Error)

Figure 2-1. Illustration of accuracy measures and characteristics for liver tests.
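
The four measures in Figure 2-1 can be computed directly from the cell counts. The following Python sketch assumes the counts P, E1, E2, and N are known, which in practice would require a gold-standard diagnosis; the function name `accuracy_measures` and the example counts are hypothetical.

```python
def accuracy_measures(p, e1, e2, n):
    """Compute test accuracy measures from the four cells of a 2x2 table.

    p  = positive test, disease present (true positives)
    e1 = positive test, no disease      (false positives)
    e2 = negative test, disease present (false negatives)
    n  = negative test, no disease      (true negatives)
    """
    return {
        "sensitivity": p / (p + e2),
        "specificity": n / (n + e1),
        "ppv": p / (p + e1),
        "npv": n / (n + e2),
    }


# Made-up counts for illustration only.
measures = accuracy_measures(p=40, e1=30, e2=10, n=920)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```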

Test errors can be made in two ways: false positives and false negatives. False
positives, positive tests in the absence of disease, are typically elevated enzyme levels due
to nonoccupational causes. They must be minimized to avoid costly and unnecessary
clinical and/or worksite intervention. The medical, social, and economic costs of
incorrectly identifying a worker as having a disease can be enormous (Harrison,
1990b; 516). False negatives (normal values despite the presence of liver dysfunction)
render preventive medicine ineffective and allow workers to return to a dangerous
environment (Douidar, 1992; 120).

Screening Enzyme Tests

An ideal test for detection of liver dysfunction would be sensitive enough to detect
minimal liver disease, specific enough to exclude normal livers, and capable of reflecting
the severity of the underlying problem. The choice of tests is based on practical
criteria such as noninvasiveness, simplicity of test performance, availability of resources,
adequacy of test analysis, and cost to ensure efficiency (Harrison, 1990a; 255). The use of
these criteria eliminates liver biopsies as a useful surveillance tool despite the fact they are
the “gold standard.” The two most important criteria for this study are accurate tests and
availability of data (data that have already been collected). Therefore, the next best
alternative is liver function tests, since they have proven to be reliable indicators of many
common forms of liver disease (Neuschwander-Tetri, 1995; 49). Further, the presence of
hepatic disease is usually first identified by these tests (Harrison, 1990a; 247 and Harrison,
1990b; 515). Results from six common liver function tests are in the PHOENIX Database
System: alanine aminotransferase (ALT), aspartate aminotransferase (AST),
gamma-glutamyl transferase (GGT), bilirubin (BR), albumin, and alkaline phosphatase (AP).

Performance measures on the tests, sensitivity and specificity, assess the adequacy
of these six tests. The most common and useful serum enzymes in screening are the
aminotransferases: alanine aminotransferase (ALT), previously known as serum
glutamic-pyruvic transaminase (SGPT), and aspartate aminotransferase (AST), previously
called serum glutamic-oxaloacetic transaminase (SGOT) (Harrison, 1990a; 255 and Leevy,
1980; 499). Transferase levels are due to release of enzyme protein from liver cells as a
result of cell turnover or injury. Elevations of serum aminotransferase activity can occur
with minor cell injury, making such determinations useful in the early detection and
monitoring of liver disease of drug or chemical origin. Serum transferases have a
relatively high sensitivity for detection of liver disease and remain the test of choice for
routine surveillance (Harrison 1990a; 255). However, a serious drawback in using
transferases is their lack of specificity: they may be elevated due to other mechanisms
which may or may not be identifiable in a clinical context (Leevy, 1980; 499).

Serum gamma-glutamyl transferase (GGT) is considered a more sensitive indicator
than aminotransferase of drug-, virus-, chemical-, and alcohol-induced hepatocellular
damage (Leevy, 1980; 501). However, because of its severe lack of specificity, one must
interpret abnormalities in conjunction with other tests, making GGT alone an incomplete
battery in screening for hepatotoxicity (Harrison, 1990a; 255 and Leevy, 1980; 501).
Serum bilirubin is of some value in detecting toxic cholestatic liver injury but is frequently
normal in the presence of mild and common cellular damage (Harrison, 1990a; 256).
Serum albumin concentration may be a useful index of cellular dysfunction in liver disease,
but it is of little value in differentiating the type of liver dysfunction (Harrison, 1990a; 256).
Serum alkaline phosphatase (AP) activity may originate from the liver, bone, intestine, or
placenta (Harrison, 1990a; 256). The normal function of AP is not fully understood
(Neuschwander-Tetri, 1995; 53). For a more complete discussion on specific tests, see
Appendix A.

There are mixed opinions within the medical community on the adequacy of the
different liver function tests. Most physicians recommend workplace screening for
hepatotoxicity with the standard serum transferases, that is, ALT and AST (Harrison,
1990a; 255). Some others recommend initial screening with AP followed by confirmation
with GGT. The federal government has recommended large batteries of tests. This study
explores the adequacy of these tests and demonstrates they are the primary indicators of
liver dysfunction in a screening application.

Limitations  of  Detecting  Occupational  Liver  Disease 

There are a number of ways to detect liver dysfunction. However, difficulty arises
in isolating the causes of liver disease, since exposure to agents that cause liver disease
is not limited to the workplace. Exposure, whether from the home, environment, or the
workplace, has the same damaging effects on the liver. With the exception of a few
chemicals that cause specific lesions, hepatic injury due to workplace exposure does not
differ clinically, morphologically, or structurally from most drug-induced damage. Thus,
it may be difficult to differentiate between occupational and nonoccupational causes on the
basis of the screening tests discussed above (Harrison, 1990a; 247). A partial list of specific
compounds, the resulting injury, and typical uses is found in Table 2-2.

Further difficulty arises since liver enzyme tests, while moderately sensitive, may
not be specific and have poor positive predictive value in identifying true occupational
liver disease. In addition, little is known about the synergistic effects of multiple
hepatotoxic exposures common to many occupations. This study is limited to identifying
clusters in liver disease and does not address specific exposures or potential synergistic
effects. It should also be recognized that these screening tests only presumptively identify
individuals who are likely, or unlikely, to have liver disease. Further tests are necessary to
diagnose and assess the severity of an individual’s condition, which is left to the 74th
SGPO (Levy, 1988; 75).




Chemical Agent            Type of Injury                      Occupation or Use

Arsenic                   Cirrhosis, hepatocellular           Pesticides
                          carcinoma, angiosarcoma
                          Granulomatous                       Ceramics workers
Carbon tetrachloride      Acute hepatocellular injury,        Dry cleaning
                          cirrhosis
Dimethylformamide         Acute hepatocellular injury         Solvent, chemical mfg.
Dimethylnitrosamine       Hepatocellular carcinoma
Dioxin                    Porphyria cutanea tarda             Pesticides
Halothane                 Acute hepatocellular injury         Anesthesiology
Hydrazine                 Steatosis
Methylene dianiline       Cholestasis                         MDA production workers
                          Acute hepatocellular injury         Painters
                          Acute hepatocellular injury         Munitions workers
Polychlorinated biphenyl  Subacute liver injury               Production, electrical utility
Tetrachloroethane                                             Aircraft mfg.
Trichloroethylene                                             Cleaning solvent, sniffing
Trinitrotoluene                                               Munitions workers
Vinyl chloride            Angiosarcoma                        Vinyl chloride workers

Table 2-2. Chemical agents associated with occupational liver disease.


Normal  Values 

In  terms  of  liver  function  tests,  it  is  difficult  to  know  what  represents  normal  and 
abnormal  values  (Douidar,  1992;  118).  Discrepancies  arise  because  normal  values  for 
aminotransferase  activities  depend  on  technique  and  conditions  as  well  as  the  composition 
of  normal  control  populations  (Leevy,  1980;  499).  In  order  to  tailor  this  investigation  to 
the  personnel  at  WPAFB,  the  normal  values  used  for  this  study  correspond  to  standards 
established by an August 1994 study done by the laboratories at the 74th SGPO (see Table
2-3). In doing so, the composition of the control population and technique used to obtain
the data conform to the entire study group.


                       ALT     AST     GGT     BILIRUBIN   ALBUMIN   AP

Lab AUG87              14-75   14-40   5-85    0.4-1.4     3.9-5.1   12-37
Lab DEC90     male                                                   39-117
              female                                                 39-117
Lab AUG94     male     0-40    0-37    11-50   0-1         3.4-5.0   39-117
              female   0-31    0-31    7-32    0-1         3.4-5.0   39-117
Lab Software  male     0-40    0-37    1-44    0.2-1.2     3.2-4.7   50-136
              female   0-40    0-37    3-24    0.2-1.2     3.2-4.7   50-136

Table 2-3. Normal values for liver function tests
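
As an illustration of how the August 1994 normal ranges might be applied, the following Python sketch flags a transferase result as elevated when it exceeds the upper limit for the individual's sex. The upper limits are taken from the Lab AUG94 rows of Table 2-3; the "M"/"F" sex coding, the function name `is_elevated`, and the rule that a value is abnormal only when it exceeds the upper limit are assumptions made for this sketch, not details confirmed by the study.

```python
# Upper limits of the Lab AUG94 normal ranges (Table 2-3) for the three
# transferase tests, keyed by an assumed "M"/"F" sex code.
AUG94_UPPER_LIMITS = {
    "M": {"ALT": 40, "AST": 37, "GGT": 50},
    "F": {"ALT": 31, "AST": 31, "GGT": 32},
}


def is_elevated(test, sex, value):
    """Return True when a test result exceeds the AUG94 upper normal limit."""
    return value > AUG94_UPPER_LIMITS[sex][test]


# The same ALT result can be normal for one sex and elevated for the other.
print(is_elevated("ALT", "M", 35))
print(is_elevated("ALT", "F", 35))
```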


III. METHODOLOGY


Methodology  Overview 

A brief discussion of aspects of medical surveillance and screening tests is
important to understanding the direction of this study. This chapter provides the
methodology used for the remainder of the study. It outlines the steps in transferring the
database from PHOENIX to the SAS System, the steps used in creating a workable
database, the programs used for the analysis, and a brief discussion of the techniques used
for the analysis. This thesis effort was accomplished in conjunction with similar research
done on pulmonary function tests. Therefore, some lung information may be found in the
programs used to develop the database for this research. Figure 3-1 depicts the flow of
this process.


Figure  3-1.  Database  and  analysis  development. 

Data Collection


The  74th  SGPO  has  maintained  their  health  database,  called  PHOENIX,  on 
WPAFB  personnel  since  1989.  The  liver  function  tests  are  performed  at  the  base  hospital 
with  the  results  initially  hand-written  on  lab  test  result  forms.  From  there,  personnel 
manually  transfer  the  information  from  test  result  forms  to  data  entry  sheets  and  finally 
into  the  database,  all  with  no  error  checking  procedures.  The  multiple  opportunities  for 
error  may  be  a  cause  for  concern. 

Performing  simple  queries  under  the  Data  Base  Reporting  option  isolates  and 
stores  each  query  in  a  separate  file;  seven  separate  files  were  extracted  from  PHOENIX. 
Downloading  the  files  to  floppy  disk  as  flat  ASCII  files  allowed  them  to  be  transferred  to 
the  UNIX  mainframe  system  at  AFIT  via  the  WS-FTP  protocol. 

Data  Conversion 

The first step in the data conversion process is to convert the seven ASCII files
into a SAS compatible database, done via the program CONVERT.SAS (see Appendix B
for all SAS programs). This program also eliminates two problems in the database. First,
the value 4303 is a code to identify “no data” and does not represent a numerical value.
CONVERT.SAS replaces all these entries with a value of 0. Second, some test dates have
no corresponding liver function test results. CONVERT.SAS deletes these entries.
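
The two cleaning steps just described can be sketched in Python. This is not the CONVERT.SAS program (see Appendix B for that); it is a minimal illustration that assumes each entry is a dictionary keyed by the variable names in Table 1-1, and that an entry has "no results" when every liver test field is either absent or holds the no-data code. The function name `clean_records` is hypothetical.

```python
NO_DATA_CODE = 4303  # code used in the extracted data to mean "no data"
TEST_FIELDS = ("ALT", "AST", "GGT", "BILIRUBIN", "ALBUMIN", "AP")


def clean_records(records):
    """Drop entries with no test results and replace the no-data code with 0."""
    cleaned = []
    for record in records:
        values = [record.get(field) for field in TEST_FIELDS]
        # Assumption: an entry has no results when every test is missing or coded 4303.
        if all(value is None or value == NO_DATA_CODE for value in values):
            continue
        fixed = dict(record)  # copy so the input record is left unchanged
        for field in TEST_FIELDS:
            if fixed.get(field) == NO_DATA_CODE:
                fixed[field] = 0
        cleaned.append(fixed)
    return cleaned


# Made-up entries for illustration only.
sample = [
    {"SSAN": "A", "DATE": "1991-03-01", "ALT": 35, "AST": 4303},
    {"SSAN": "B", "DATE": "1991-04-01", "ALT": 4303, "AST": None},
]
print(clean_records(sample))
```

The sketch follows the description above in replacing the code with 0, so a 0 in the cleaned data stands for "no data" rather than a measured value.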

In  PHOENIX,  administering  each  new  test  results  in  a  new  entry  in  the  system.  As 
a  result,  a  single  SSAN  may  have  multiple  entries,  each  corresponding  to  a  different 
testing date. A series of programs, called *RAW.SAS (* replaces each liver function test
variable), eliminates the multiple observations of each SSAN by putting every test result
and the respective test date on a single line. This is done for each liver function test
variable. The maximum number of observations for any test is seven, which is hard coded
in the programs; the variables are *1, *2, *3, ..., *7. The outputs of these programs are
designated *.RAW files.
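The reshaping performed by the *RAW.SAS programs, one line per SSAN with up to seven numbered results, can be sketched as follows. The function name, the date-variable naming (`ALTDATE1`, ...), and the chronological ordering are assumptions made for this illustration; the text does not state how observations are ordered.

```python
from collections import defaultdict

MAX_OBS = 7  # maximum number of observations for any test, per the text

def to_wide(rows, test="ALT"):
    """Collapse (ssan, date, value) rows into one record per SSAN."""
    by_ssan = defaultdict(list)
    for ssan, date, value in rows:
        by_ssan[ssan].append((date, value))
    wide = {}
    for ssan, obs in by_ssan.items():
        obs.sort()  # assumed: ISO-format dates, which sort chronologically
        record = {}
        for i, (date, value) in enumerate(obs[:MAX_OBS], start=1):
            record[f"{test}{i}"] = value        # e.g. ALT1, ALT2, ...
            record[f"{test}DATE{i}"] = date     # hypothetical date variable name
        wide[ssan] = record
    return wide

# Hypothetical rows for illustration only.
rows = [("001", "1992-03-09", 28), ("001", "1991-03-04", 31), ("002", "1991-05-20", 45)]
wide = to_wide(rows)
```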

The final step is to convert the *.RAW files into SAS files, which is done by
MERGEALL.SAS. This program also combines all the *.RAW files into a single database
called HEALTH.WPAFB2. This workable database contains 174 variables and 2312
subjects (unique SSANs).

Classification  of  Work  Zone  Exposure  Areas 

In  order  to  monitor  common  exposures,  each  subject  is  assigned  a  work  zone. 
PHOENIX  tracks  these  work  zones  and  the  dates  in  which  a  subject  works  in  a  particular 
zone.  Work  zones  are  based  on  the  area  of  WPAFB  in  which  a  person  works,  either  A,  B, 
C,  or  K  (Kittyhawk),  the  building  number,  a  letter  for  identifying  common  exposures,  and 
a number for further breakdown of the exact common exposures. For purposes of this
analysis, common exposure areas for work zones are based on the area, building number,
and first letter of exposure. This decision was made on the advice of the 74th SGPO.
LUNGALL.SAS classifies the zones for this analysis. This study uses 115 zones.
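The classification rule (area, building number, and first exposure letter) can be sketched as a small parser. The code layout assumed here, an area letter followed by digits, one letter, and an optional trailing digit, is inferred from zone labels such as B33A1 and is not confirmed by the text; `exposure_area` is a hypothetical name, not part of LUNGALL.SAS.

```python
import re

# Assumed layout: area letter (A, B, C, or K), building number,
# exposure letter, optional digit for the finer exposure breakdown.
ZONE_PATTERN = re.compile(r"^([ABCK])(\d+)([A-Z])(\d*)$")

def exposure_area(work_zone):
    """Return area + building + first exposure letter, or None if unparseable."""
    match = ZONE_PATTERN.match(work_zone.strip().upper())
    if match is None:
        return None
    area, building, letter, _detail = match.groups()
    return f"{area}{building}{letter}"

for code in ("A278A", "B33A1", "B5G1"):
    print(code, "->", exposure_area(code))
```

Codes that do not fit the assumed layout return `None`, so they can be reviewed by hand rather than silently misclassified.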

Dimensionality Reduction

Analysis begins after the creation of a workable database. The medical literature
review suggests ALT and AST as the primary tests of interest in medical surveillance of
the liver. By applying multivariate data reduction techniques, this claim may be supported.
Two multivariate data analysis techniques applied in this study are principal components
analysis and factor analysis. Both data reduction techniques study, explore, and hopefully
simplify the interrelationships among the set of variables. Principal components analysis
transforms  the  original  set  into  new  variables,  called  components,  which  are  uncorrelated 
linear  combinations  of  the  original  variables.  The  eigenvalues  of  the  correlation  matrix  of 
original  variables  determine  the  number  of  components  to  include  in  the  analysis.  In  this 
study,  we  employ  Kaiser’s  Criterion;  all  components  with  eigenvalues  greater  than  1.0 
will  be  considered  significant.  The  number  of  significant  principal  components  will 
determine  the  number  of  factors  that  will  be  used  in  the  factor  analysis. 

Factor analysis is very similar to principal components analysis. Principal
components analysis explains as much of the total variation as possible with the number
of components selected, while factor analysis explains the interrelationships (common
variation) among the original variables and, ideally, reduces the number of variables used
through the factors.




Control  Charts 


After employing multivariate techniques and determining the final data, investigation
turns to actually identifying the zones with high proportions of liver disease. The
technique used is a form of statistical process control (SPC). SPC quickly detects
occurrences (zones) with assignable variability (occupational cause of liver disease). SPC
relies heavily on the control chart, a graphical display of a quality characteristic that has
been measured or computed from a sample versus the sample number (Montgomery,
1991; 103). This application uses the liver function test results as the quality characteristic
and the work zone for the sample number. The chart contains a center line representing
the average value of the liver test and another horizontal line called an upper control limit
(UCL). A zone that is in control plots below the UCL in all but a preselected percentage
of cases. Liver test results from in-control zones report either normal or abnormal values
by chance alone. On the other hand, if a zone has an unexpectedly frequent occurrence of
high test results, it will plot above the UCL. A zone plotting above the UCL does not
necessarily indicate occupational liver disease, but signals that an investigation may be
necessary.

In developing control charts, a number of their attributes must be addressed. First,
they are generally based on limits ±3σ away from the mean (3 standard deviations in either
direction from the average). This accounts for 99.73% of the observations (under the
normal distribution). Although this is a standard practice, we used the established lab
normals, which account for anywhere between almost 90% and over 99% of the
observations depending on the test (see Table 4-2). Secondly, we were confronted with
varying sample sizes. We accounted for the variation by standardizing the results and
plotting:

    Z_i = \frac{\hat{p}_i - p}{\sqrt{p(1 - p)/n_i}}

where Z_i       = the plotted statistic (in standard deviation units)
      \hat{p}_i = (abnormal people in zone i)/sample size
      p         = probability of being abnormal (0.05)
      n_i       = sample size for zone i

Once these issues have been addressed, the control chart proves to be an excellent tool
for identifying the work zones where occupational liver disease may be a problem.
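As a worked illustration of the standardized statistic, consider a hypothetical zone (the counts are assumed for illustration and are not taken from the data) with n_i = 25 workers, 6 of whom are abnormal, and p = 0.05:

```latex
\hat{p}_i = \frac{6}{25} = 0.24, \qquad
Z_i = \frac{\hat{p}_i - p}{\sqrt{p(1 - p)/n_i}}
    = \frac{0.24 - 0.05}{\sqrt{(0.05)(0.95)/25}}
    = \frac{0.19}{0.0436} \approx 4.36
```

A zone of this kind would plot above a limit of 3 standard deviation units, while a zone of the same size with 2 abnormal workers (\hat{p}_i = 0.08) would give Z_i ≈ 0.69 and plot below it.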


IV.  RESULTS 


Overview 

This chapter reports all the findings from this analysis. First, we approximated the
liver function test empirical distributions. From these distributions, we established upper
control limits from the population and compared them to those established by the 74th
SGPO. Next, we reduced the data set using multivariate data analysis techniques
(principal components analysis and factor analysis). Using those results, we subjected the
reduced data set to process control methods where we identified the abnormal zones using
three different criteria. This chapter concludes with a brief summary of those findings.

Liver  Function  Test  Distributions 

This analysis produced the desired product of the empirical distributions for each
liver function test on WPAFB personnel. To achieve this, the liver test scores for each
SSAN, between one and seven observations, were averaged to ensure independence
between data points. BestFit software tested each set of outcomes against 18 families of
distributions and approximated the optimal parameters. To prevent biasing the fit to
the distributions, data points outside the expected ranges were eliminated (they are
considered erroneous data). A number of the liver function tests were well approximated
by normal distributions. All those not well approximated by normal distributions were
positively skewed. After transformation to natural logarithms, their empirical
distributions were approximately normal. Figure 4-1 contains the empirical distributions
of the tests and the transformations. These transformations enable us to apply later
statistical methods requiring normally distributed data. After transformations, all tests are
well approximated by normal distributions, based on Wilk-Shapiro criteria listed in Table
4-1.
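The averaging and log-transformation steps can be sketched as below. The data are hypothetical, and the moment-based skewness is used here only as a simple stand-in to show the effect of the transformation; it is not the BestFit or Wilk-Shapiro procedure used in the study.

```python
import math
from statistics import mean

def subject_means(observations):
    """Average each subject's repeated results (one to seven per SSAN)."""
    return {ssan: mean(values) for ssan, values in observations.items()}

def skewness(xs):
    """Moment-based skewness; positive values indicate a long right tail."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m3 = mean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5

# Hypothetical ALT readings keyed by SSAN, for illustration only.
alt = {"001": [22, 26], "002": [18], "003": [35, 41, 38], "004": [24], "005": [120, 96]}
means = list(subject_means(alt).values())
log_means = [math.log(v) for v in means]   # natural log transformation

print("skewness before:", skewness(means))
print("skewness after: ", skewness(log_means))
```

With one averaged value per SSAN, each subject contributes a single data point, which is the independence argument made in the text.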


Analysis       Eliminated Outliers             Number of   Sample   Sample Std   Wilk-
                                               Workers     Mean     Deviation    Shapiro
ALT            none                            804         26.10    13.11        0.8400
AST            514                             753         23.08    10.06        0.6210
GGT            1601.7, 258                     555         27.02    18.97        0.7000
Bilirubin      51.2                            620         0.57     0.30         0.8433
Albumin        none                            622         4.13     0.32         0.9203
AP             4725, 4266, 893.8, 793.7, 1     656         77.57    21.08        0.9523
ln ALT         none                            804         3.16     0.46         0.9717
ln AST         6.24                            753         3.08     0.31         0.9237
ln GGT         7.3788, 5.553                   555         3.14     0.53         0.9743
ln Bilirubin   3.9357                          618         -0.66    0.47         0.9809

Table 4-1. Liver function tests applied to normal distributions.


Upper  Control  Limits 

From the empirical distributions of each test, we determined the upper end
percentiles for the population used for study and related them to the established upper
control limits. To do so, we rank-ordered the observed values from smallest to largest
and picked the desired percentile directly from the rank-ordered list. Table 4-2 gives the
relevant percentiles of the population studied.

Figure 4-1. Distributions of liver function tests.

                              Percentiles
Test       Lab Upper Limit    90th    95th    97.5th    99th
ALT        40                 42, 49, 70 (one percentile value missing)
AST        37                 31, 44, 68 (one percentile value missing)
GGT        50                 47      63      79        132
BR         1                  1       1.1     1.4       1.6
Albumin    5.0                4.5     4.6     4.7       4.9
AP         117                88      92      94        96

Table 4-2. 90th through 99th percentiles of workers.


Table 4-2 demonstrates that the normal limits used and established by the 74th SGPO
vary from somewhere below the 90th percentile (ALT) to above the 99th percentile
(Albumin and AP) based on the population for this study. Ideally, all tests should use the
same percentile, say the 95th, for classifying a reading as normal or abnormal. We used the
hospital lab upper limits requested by the 74th SGPO.
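Picking a percentile directly from a rank-ordered list, and checking where a lab limit falls in the population, can be sketched as follows. The nearest-rank rule is an assumption; the text does not say which convention was used to pick the value from the list, and both function names are hypothetical.

```python
import math

def percentile_nearest_rank(values, pct):
    """Return the value at the given percentile of the rank-ordered list."""
    ordered = sorted(values)                              # smallest to largest
    rank = max(1, math.ceil(pct * len(ordered) / 100))    # assumed nearest-rank rule
    return ordered[rank - 1]

def fraction_at_or_below(values, limit):
    """Share of the population at or below a given upper limit."""
    return sum(v <= limit for v in values) / len(values)

# Hypothetical readings for illustration only.
readings = list(range(1, 101))
p95 = percentile_nearest_rank(readings, 95)
share = fraction_at_or_below(readings, 40)
```

The second function gives the comparison made in Table 4-2 from the other direction: it reports what share of the studied population a fixed lab limit actually covers.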

Multivariate  Analysis 

The medical literature review suggested ALT and AST as the primary tests of
interest in medical surveillance of the liver. By applying multivariate data reduction
techniques, this claim may be supported. Two multivariate data analysis techniques
applied in this study were principal components analysis and factor analysis.

For this portion of the study, we only used an observation if all six variables were
recorded on the date the liver test was administered. Principal components analysis and
factor analysis require the same number of observations for each variable. As a result, 424
individuals with all six liver function test results were used in the multivariate portion of
the analysis.




The main objective of this analysis was to reduce the dimensionality of the six liver
tests to two or three dimensions which can help explain the underlying communality (how
each variable covaries with the factors) of the tests. It is hoped factor score plots will
reveal regions of normal and abnormal scores. The six variables (ALT, AST, GGT,
bilirubin, albumin, and AP) determine the six dimensions of the data set. Table 4-3 is a
summary of our normality tests on these variables, using the 424 observation subset and
the Wilk-Shapiro statistic.


Variable     Wilk-Shapiro Statistic    Wilk-Shapiro Statistic
             of Variable               of ln(Variable)
ALT          0.8817                    0.9887
AST          0.7349                    0.9538
GGT          0.6830                    0.9686
Bilirubin    0.9652 (only one value legible)
Albumin      0.9565 (only one value legible)
AP           0.9483                    0.9739

Table 4-3. Wilk-Shapiro statistics.


For this study, a Wilk-Shapiro value of 0.9 or higher was considered acceptable for an
approximation of normally distributed data. Even though the log transformations for
albumin and AP improve the normality, the improvement appears nominal. Therefore, we
used ln(ALT), ln(AST), ln(GGT), ln(BR), albumin, and AP for the principal
components analysis. By using these transformations in place of the original variables
(using data with approximately normal distributions), the first two eigenvalues explain
about three percent more of the variance.

Using the ln(ALT), ln(AST), ln(GGT), ln(BR), albumin, and AP values, we
obtained the correlation matrix used for the principal components analysis (see Table 4-4).
Three data points had a bilirubin value equal to zero; they were deleted from the data set
since ln(0) is undefined.


           LnALT     LnAST     LnGGT     LnBR      AP        Albumin
LnALT      1.0000    0.6495    0.5027    -0.0204   0.1131    0.2494
LnAST      0.6495    1.0000    0.2516    -0.0038   0.0807    0.1526
LnGGT      0.5027    0.2516    1.0000    -0.0968   0.2337    0.1748
LnBR       -0.0204   -0.0038   -0.0968   1.0000    -0.1342   0.0695
AP         0.1131    0.0807    0.2337    -0.1342   1.0000    0.0177
Albumin    0.2494    0.1526    0.1748    0.0695    0.0177    1.0000

Table 4-4. Correlation matrix.


From the correlation matrix, we obtained the eigenvalues, which can be used to
calculate the amount of variance each of the components explains; the more variance
explained, the better (Table 4-5). Using Kaiser’s criterion, only two principal components
were suggested for use in this analysis. Although the third principal component has an
eigenvalue close to 1.0, we adhered to the criterion, established before the study began,
of only accepting values above 1.0. The first two components explain about 55% of the
total variation in the data.


          Eigenvalue   Difference   Proportion   Cumulative
PRIN1     2.11392      0.938320     0.352320     0.35232
PRIN2     1.17560      0.270290     0.195933     0.54825
PRIN3     0.90531      0.065253     0.150885     0.69914
PRIN4     0.84005      0.157905     0.140009     0.83915
PRIN5     0.68215      0.399177     0.113692     0.95284
PRIN6     0.28297                   0.047162     1.00000

Table 4-5. Eigenvalues of the correlation matrix.

           PRIN1       PRIN2       PRIN3       PRIN4       PRIN5       PRIN6
LnALT      0.606582    0.123107    -0.216586   0.026185    -0.024660   -0.754122
LnAST      0.517509    0.180599    -0.428703   0.097185    0.441817    0.557795
LnGGT      0.485061    -0.200184   0.165009    0.050529    -0.762522   0.336780
LnBR       -0.063085   0.673889    0.262303    0.678808    -0.110395   0.011112
AP         0.212882    -0.585046   0.462072    0.487615    0.397880    -0.053061
Albumin    0.282242    0.340220    0.678058    -0.537372   0.228591    0.061688

Table 4-6. Matrix of eigenvectors.
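Kaiser's criterion, as applied to the eigenvalues reported in Table 4-5, amounts to the short calculation sketched below. The variable names are illustrative; the eigenvalues are copied from the table rather than recomputed from the correlation matrix.

```python
# Eigenvalues of the correlation matrix, as reported in Table 4-5.
eigenvalues = [2.11392, 1.17560, 0.90531, 0.84005, 0.68215, 0.28297]

# For a correlation matrix the eigenvalues sum to the number of variables.
total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Kaiser's criterion: retain components with eigenvalues greater than 1.0.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

print("retained components:", retained)
print("variance explained by retained components:", round(cumulative[len(retained) - 1], 4))
```

The third eigenvalue (0.90531) falls just short of the cutoff, which is why the text discusses it separately.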


By Kaiser’s criterion, the data is reduced to two dimensions with the first
dimension (PRIN1) being characterized by ln(ALT), ln(AST), and ln(GGT) and the
second dimension (PRIN2) being characterized by AP and ln(bilirubin). Albumin
dominates PRIN3, which has a corresponding eigenvalue of 0.9. Adding this dimension would
then explain about 70% of the total variation. Although this is a valid case for including the
third component, the albumin test has a different clinical interpretation than the other five
variables. Also, from an analytical viewpoint, it is better to use the single variable albumin
(independently) instead of the third principal component. Since the component's eigenvalue
is less than 1.0 and the component is heavily influenced only by the single variable, the
variable itself should be used if the information it portrays is important enough. At this
point, we kept albumin in the data set, but adhered to Kaiser’s criterion and examined only
the first two dimensions in the factor analysis. Later, the effects of removing albumin were
also examined. The factor pattern of the initial factor analysis using the principal
components procedure above is in Table 4-7.

The factors appear interpretable; factor 1 deals primarily with tests measuring the
direct health of the liver (how many liver cells are damaged or dying) while factor 2 deals
with congestion within the liver. This supports the distinctions made by
Douidar, 1992, who stated ALT, AST, and GGT all represent loss of hepatocyte cellular
integrity and BR and AP measure cholestatic functioning. Although the factors are
interpretable, a varimax rotation was applied to make these loadings clearer and easier to
interpret. This transformation is used to find new axes in the two dimensional space to
represent the factors. The new axes were determined by maximizing the sum of the
variances of the squared factor loadings within each factor and adjusting them by dividing
by the communalities which correspond to these variables. The orthogonal transformation
matrix which accomplished this rotation is in Table 4-8.

                      Factor 1    Factor 2    Communality
LnALT                 0.88193     0.13348     0.795616
LnAST                 0.75242     0.19581     0.604483
LnGGT                 0.70525     -0.21705    0.544481
LnBR                  -0.09172    0.73066     0.542284
AP                    0.30952     -0.63434    0.498182
Albumin               0.41036     0.36888     0.304471
Variance Explained    2.113918    1.175598
Final Communality Estimate: Total = 3.289517

Table 4-7. Factor pattern.
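The unrotated factor pattern in Table 4-7 can be reproduced from the first two eigenvalues (Table 4-5) and eigenvectors (Table 4-6): with principal-component extraction, each loading is the eigenvector element scaled by the square root of its eigenvalue, and a variable's communality is the sum of its squared loadings. The sketch below copies the reported values; the variable names are illustrative.

```python
import math

eigenvalues = [2.11392, 1.17560]  # PRIN1 and PRIN2, from Table 4-5

# First two eigenvector columns, from Table 4-6.
eigenvectors = {
    "LnALT":   (0.606582, 0.123107),
    "LnAST":   (0.517509, 0.180599),
    "LnGGT":   (0.485061, -0.200184),
    "LnBR":    (-0.063085, 0.673889),
    "AP":      (0.212882, -0.585046),
    "Albumin": (0.282242, 0.340220),
}

# Loading = eigenvector element * sqrt(eigenvalue).
loadings = {
    var: tuple(e * math.sqrt(lam) for e, lam in zip(vec, eigenvalues))
    for var, vec in eigenvectors.items()
}

# Communality = sum of squared loadings across the retained factors.
communality = {var: sum(x * x for x in ld) for var, ld in loadings.items()}

for var in loadings:
    print(var, [round(x, 5) for x in loadings[var]], round(communality[var], 6))
```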


          1           2
1         0.96476     -0.26315
2         0.26315     0.96476

Table 4-8. Orthogonal transformation matrix for varimax rotation.
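Applying the orthogonal transformation to the unrotated loadings of Table 4-7 gives the rotated pattern. The sketch below assumes off-diagonal entries of ±0.26315 in the transformation matrix, the magnitude that makes the matrix orthogonal to the reported precision; it only applies the rotation and does not search for the varimax solution itself.

```python
# Orthogonal transformation matrix (rows x columns); off-diagonal magnitude
# 0.26315 is assumed so that the matrix is orthogonal.
T = ((0.96476, -0.26315),
     (0.26315, 0.96476))

# Unrotated loadings (Factor 1, Factor 2), from Table 4-7.
unrotated = {
    "LnALT":   (0.88193, 0.13348),
    "LnAST":   (0.75242, 0.19581),
    "LnGGT":   (0.70525, -0.21705),
    "LnBR":    (-0.09172, 0.73066),
    "AP":      (0.30952, -0.63434),
    "Albumin": (0.41036, 0.36888),
}

def rotate(loading, t):
    """Multiply a row of loadings by the 2 x 2 transformation matrix."""
    f1, f2 = loading
    return (f1 * t[0][0] + f2 * t[1][0],
            f1 * t[0][1] + f2 * t[1][1])

rotated = {var: rotate(ld, T) for var, ld in unrotated.items()}

for var, (f1, f2) in rotated.items():
    print(var, round(f1, 5), round(f2, 5))
```

Because the transformation is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged by the rotation; only the split between the two factors changes.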


We get the factor pattern in Table 4-9 after the varimax rotation, which gives us the same
interpretation as before with slightly more distinction between the factors. One point of
interest is the sign difference between ln(BR) and AP in factor 2. This contrast stems
from the nature of the tests and what they measure; a rise in bilirubin results when there is
excessive red blood cell destruction in the liver while AP (a protein) production is
decreased in a dysfunctional liver. The negative correlation is expected since bilirubin
increases and albumin decreases with liver damage.


                      Factor 1    Factor 2    Communality
LnALT                 0.88597     -0.10330    0.795616
LnAST                 0.77743     -0.00908    0.604483
LnGGT                 0.62327     -0.39498    0.544481
LnBR                  0.10378     0.72905     0.542284
AP                    0.13168     -0.69343    0.498182
Albumin               0.49297     0.24790     0.304471
Variance Explained    2.048943    1.240573

Table 4-9. Factor pattern after varimax rotation.


Each factor score was estimated by a linear combination of standardized values of the six
variables. The standardized scoring coefficients are in Table 4-10. Using these
coefficients, each observation was given a factor 1 and factor 2 score. Utilizing the
normal values for liver function tests established by the 74th SGPO (Table 2-3),
individuals were classified as normal or abnormal. For plotting purposes, those with
abnormal readings were put into one of two categories. Abnormal results in any of the
tests primarily contributing to factor 1 (ALT, AST, and GGT) were combined as were the
individuals with abnormal results in tests primarily contributing to factor 2 (BR and AP).
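The scoring step, a linear combination of standardized values, can be sketched with the coefficients reported in Table 4-10. The standardized inputs below are hypothetical; the means and standard deviations used to standardize the 424-observation subset are not given in this section, so the example starts from values that are already standardized.

```python
# Standardized scoring coefficients (Factor 1, Factor 2), from Table 4-10.
COEFFS = {
    "LnALT":   (0.43238, -0.00025),
    "LnAST":   (0.38722, 0.06703),
    "LnGGT":   (0.27328, -0.26591),
    "LnBR":    (0.12169, 0.61104),
    "AP":      (-0.00073, -0.55910),
    "Albumin": (0.26985, 0.25164),
}

def factor_scores(z):
    """z maps each variable to its standardized value; returns (factor 1, factor 2)."""
    f1 = sum(COEFFS[var][0] * z[var] for var in COEFFS)
    f2 = sum(COEFFS[var][1] * z[var] for var in COEFFS)
    return f1, f2

# Hypothetical individual: elevated transferases, everything else at the mean.
z_example = {"LnALT": 2.0, "LnAST": 1.5, "LnGGT": 1.0,
             "LnBR": 0.0, "AP": 0.0, "Albumin": 0.0}
f1, f2 = factor_scores(z_example)
print("factor 1:", round(f1, 3), "factor 2:", round(f2, 3))
```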


           Factor 1    Factor 2
LnALT      0.43238     -0.00025
LnAST      0.38722     0.06703
LnGGT      0.27328     -0.26591
LnBR       0.12169     0.61104
AP         -0.00073    -0.55910
Albumin    0.26985     0.25164

Table 4-10. Standardized scoring coefficients.

Since albumin did not factor heavily in the scores, an abnormal albumin reading was not
plotted. The factor scores for normal and abnormal individuals are plotted in Figure 4-2.


[Scatter plot: Factor 1 by Factor 2 Plot; legend categories: alt, ast, ggt; br or ap; normal]

Figure 4-2. Plot of factor 1 scores and factor 2 scores.

Figure 4-2 shows four regions: normal subjects, abnormalities corresponding to
component 1, and two regions of abnormalities corresponding to component 2.
Component 1 is predominantly associated with high factor 1 scores (> 1.1) and nearly
uniform with respect to factor 2 scores. On the other hand, component 2 is predominantly
in the region of high or low factor 2 scores and uniform with respect to factor 1. Those
low factor 2 scores are not a concern because they do not represent liver dysfunction.
Figure 4-2 not only illustrates the regions of normality and abnormality, but also
demonstrates the distinction between the two factors and their respective components
(ALT, AST, and GGT are associated with quality of the cellular function while BR and
AP are associated with congestion in the liver). Since albumin measures something
independent, it was not included in Figure 4-2. However, keeping albumin in the data set
may have caused a confounding effect and had undue influence on the scores. One of the
reasons for accomplishing factor analysis is to determine what variables are important, and
albumin did not appear important for what we wanted to measure. Independent studies by
Kremer, 1994, Lundberg, 1994, and Tamburro, 1981, support using ALT, AST, GGT,
BR, and AP (eliminating albumin). Therefore, we considered the contribution from
albumin to be irrelevant information for this study, dropped it from the original data set,
and performed the analysis again. The results using the remaining five variables are found
in Tables 4-11 through 4-14 and Figure 4-3.

The analysis for the five variable case paralleled the six variable case, but with
stronger evidence that ALT, AST, and GGT are interrelated and separate from the
interrelated BR and AP. Figure 4-3 shows similar results to Figure 4-2, but the regions
are more pronounced. There appears to be a fairly good distinction in the factor 1 scores
in measuring the health of the liver as abnormal or normal. A similar break exists with
respect to the congestion measure on the factor 2 score scale. However, this distinction is
not as clear as in factor 1. The conclusions drawn from the factor analysis allowed us to
classify the factors based on their respective functions: factor 1 can be called a
Transferase Index and factor 2 a Liver Congestion Index.

          Eigenvalue   Difference   Proportion   Cumulative
Prin 1    2.0187       0.8817       0.4037       0.4037
Prin 2    1.1370       0.2719       0.2274       0.6311
Prin 3    0.8650       0.1711       0.1730       0.8041
Prin 4    0.6939       0.4085       0.1388       0.9429
Prin 5    0.2854                    0.0571       1.0000

Table 4-11. Eigenvalues of the correlation matrix with five input variables.


                 Before Rotation         After Varimax Rotation
                 Factor 1    Factor 2    Factor 1    Factor 2    Communality
LnALT            0.88382     0.22600     0.91172     0.03129     0.832215
LnAST            0.76797     0.32739     0.82901     -0.09855    0.696972
LnGGT            0.71672     -0.18362    0.63555     0.38009     0.548399
LnBR             -0.13360    0.72395     0.07508     -0.73234    0.541952
AP               0.34093     -0.64998    0.14524     0.71764     0.536105
Var Explained    2.018690    1.136953    1.949155    1.206489

Table 4-12. Factor patterns before and after rotation for five variable case.


          1          2
1         0.95976    0.28082
2         0.28082    -0.95976

Table 4-13. Orthogonal transformation matrix for five variable case.


         Factor 1    Factor 2
LnALT    0.47602     -0.06783
LnAST    0.44599     -0.16954
LnGGT    0.29473     0.25699
LnBR     0.11529     -0.62971
AP       0.00204     0.59442

Table 4-14. Standardized scoring coefficients for five variable case.




Figure 4-3. Factor 1 by Factor 2 plot using five variables.


The use of these indices as screening metrics had promising potential. However,
the 74th SGPO desired a single easy measure for the screening of liver disease.
Therefore, we pursued further data reduction. Occupational exposures causing liver
disease primarily impact the inputs to the Transferase Index, made up of the natural logs
of ALT, AST, and GGT. Therefore, we emphasized these tests for further examination
and eliminated BR and AP from consideration as screening tools for liver disease.
Although eliminating data has potential adverse consequences, our goal was to find the
easiest efficient metric possible, and eliminating data that is not as meaningful helped
accomplish this end. To further understand the relationships between ALT, AST, and
GGT, we performed another factor analysis on just these three variables to determine the
underlying communality of these tests (an attempt to reduce three dimensional data to one
dimension); one factor was retained by Kaiser’s criterion. The results are summarized in
Tables 4-15, 4-16, and Figure 4-4.


          Eigenvalue   Difference   Proportion   Cumulative
Prin 1    1.9424       1.1744       0.6475       0.6475
Prin 2    0.7680       0.4783       0.2560       0.9034
Prin 3    0.2897                    0.0966       1.0000

Table 4-15. Eigenvalues of the correlation matrix using ALT, AST, and GGT.


                      Factor 1    Communality    Standardized Scoring
                                                 Coefficient
LnALT                 0.91085     0.829652       0.46894
LnAST                 0.80399     0.646400       0.41392
LnGGT                 0.68287     0.466316       0.35157
Variance Explained    1.942368
Final Communality Estimates: Total = 1.942368

Table 4-16. Results from factor analysis on ALT, AST, and GGT.

The natural log of ALT is highly correlated with the resulting factor score at 0.91
which, not surprisingly, is seen in Figure 4-4. Figure 4-4 shows that if the ALT reading is
abnormal, we should also have an abnormal Transferase Index. Although there are some
missed observations (where the Transferase Index indicates abnormality but there is a
normal ALT reading), the overall trend is convincing. This supports the literature which
states the ALT measurement is the most useful tool in screening for occupational health
hazards. Abnormal factor 1 scores with normal ALT scores appear to be related to the
lack of specificity found in the GGT test. Therefore, ALT is a reasonable sole indicator of
the liver disease the 74th SGPO is trying to identify.

Control  Charts 

Multivariate analysis indicated that a combination of ALT, AST, and GGT was the
optimal liver function test battery. Further, ALT by itself is a respectable indicator as a
screening test for liver disease. These liver tests were subjected to statistical process
control (SPC) techniques and the resulting control charts (graphical displays of liver test
results versus work zones) were constructed. Due to the nature of the data set, separate
control charts were made for each year.

Three sets of control charts were developed. The first set used all the inputs in the
Transferase Index. Due to inconsistent data collection from year to year, the actual
Transferase Index could not be applied. Therefore, if an individual had at least one test
(ALT, AST, or GGT) above the established upper limits (see Table 2-3), they were
classified as abnormal for that year. Every person was then put in their respective work
zone and the zones were standardized based on sample size. To standardize, the variable
plotted is:

    Z_i = \frac{\hat{p}_i - p}{\sqrt{p(1 - p)/n_i}}

where Z_i       = the plotted statistic (in standard deviation units)
      \hat{p}_i = (abnormal people in zone i)/sample size
      p         = probability of being abnormal (0.05)
      n_i       = sample size for zone i

The observations were arranged in descending order by standardized score to avoid any
sense of time series from one observation (zone) to the next. An example of a control
chart is in Figure 4-5.
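The procedure for the first set of charts (classify each person, group by zone, standardize, sort in descending order, compare with the limit) can be sketched end to end. The upper limits below are assumptions read from the Lab Upper Limit column of Table 4-2, since Table 2-3 is not reproduced here; the zone labels and readings are hypothetical.

```python
import math
from collections import defaultdict

# Assumed limits, read from the Lab Upper Limit column of Table 4-2.
UPPER_LIMITS = {"ALT": 40, "AST": 37, "GGT": 50}
P_ABNORMAL = 0.05   # probability of being abnormal, per the text
UCL = 3.0           # upper control limit in standard deviation units

def is_abnormal(results):
    """True if at least one available test exceeds its upper limit."""
    return any(t in results and results[t] > lim for t, lim in UPPER_LIMITS.items())

def zone_scores(people):
    """people: iterable of (zone, results) pairs for a single year."""
    counts = defaultdict(lambda: [0, 0])          # zone -> [abnormal, total]
    for zone, results in people:
        counts[zone][0] += int(is_abnormal(results))
        counts[zone][1] += 1
    scores = []
    for zone, (abnormal, n) in counts.items():
        p_hat = abnormal / n
        z = (p_hat - P_ABNORMAL) / math.sqrt(P_ABNORMAL * (1 - P_ABNORMAL) / n)
        scores.append((zone, z))
    # Descending order by standardized score, as in the charts.
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical data: zone X1 has 6 abnormal of 25, zone Y2 has 1 abnormal of 20.
people = ([("X1", {"ALT": 55})] * 6 + [("X1", {"ALT": 22, "AST": 20})] * 19
          + [("Y2", {"GGT": 80})] + [("Y2", {"ALT": 30})] * 19)
scores = zone_scores(people)
out_of_control = [zone for zone, z in scores if z > UCL]
```

Restricting `UPPER_LIMITS` to ALT alone would give the single-test variant of the same calculation.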


Figure 4-5. 1991 standardized control chart for Transferase Index inputs.


Figure 4-5 shows 18 zones above the UCL of 3.0 standard deviations in 1991. The case
numbers which plot above the acceptable limits correspond to specific zones (which can
be read off a chart); using case numbers simplified the chart. 1991 was chosen because
the data in that year were the most extensive. Control charts were made for each year and
a summary table for this set of control charts is in Table 4-17. The summary table is used
to eliminate looking up case numbers in a table. We also examined the effects of multiple
abnormal scores for the same individual and no conclusive findings were made.

The second set of control charts used a different criterion for abnormality. At the
request of the 74th SGPO (and supported by the multivariate analysis), only ALT was
considered. If an individual had an abnormal test result in a given year, they were
classified as abnormal. The remainder of the procedure followed that of the first set of
control charts. A summary table for this set of control charts is found in Table 4-18.

The natural question becomes “which criterion is the better alternative?” Table 4-19
shows a comparison between the two alternatives. Every shaded area is a year in which the
respective zone was identified as out of control by the Transferase Index inputs
criterion. A “hit” signifies where the ALT criterion also found that zone to be out of
control and a “miss” indicates in control by the ALT criterion. It should also be noted that
the data for 1994 and 1995 rely heavily on ALT; AST and GGT results are scarce, which
may skew the comparison of the two criteria.
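The hit and miss bookkeeping behind such a comparison reduces to set operations on (zone, year) pairs. The flags below are hypothetical and the function name is illustrative; the sketch only shows how hits and misses are defined relative to the Transferase Index inputs criterion.

```python
def compare_criteria(index_flags, alt_flags):
    """Each argument is a set of (zone, year) pairs flagged as out of control."""
    hits = sorted(index_flags & alt_flags)     # flagged by both criteria
    misses = sorted(index_flags - alt_flags)   # flagged by the index inputs only
    return hits, misses

# Hypothetical flags for illustration only.
index_flags = {("X1", 1991), ("X1", 1992), ("Y2", 1991)}
alt_flags = {("X1", 1991), ("Y2", 1991), ("Z3", 1993)}

hits, misses = compare_criteria(index_flags, alt_flags)
hit_rate = len(hits) / len(index_flags)
```

Zones flagged only by the ALT criterion fall outside this comparison, since hits and misses are counted against the zones identified by the Transferase Index inputs.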

Table 4-17. Standardized scores of zones above upper control limits based on
abnormal ALT, AST, or GGT tests (WL = Wright Labs, AL = Armstrong Labs).

Table 4-18. Standardized scores of zones above upper control limits using abnormal
ALT criterion (WL = Wright Labs, AL = Armstrong Labs).

Lastly, a series of demerit charts was produced. A demerit system was employed
to account for varying degrees of abnormality. Not all abnormal test results are equally
important, since a zone with three or four individuals moderately above normal is not as
great a concern as a zone with two or three severe cases. Therefore, each abnormal
test was assigned to a class according to severity. Each class represents a standard
deviation further away from the mean. For the men, the standard deviation (σ = 13.11)
was rounded up to 14 since all the ALT readings were integers. Further, the mean
(26.096) plus 1σ (14) coincided with the upper control limit established by the 74th
SGPO. A similar situation occurred with the females' ranges. The scheme we used was
the following (female values in parentheses):
Class                    Test Range             Approximate Standard
                                                Deviation Range

Class A Abnormalities    82 or > (69 or >)      > 4σ
Class B Abnormalities    68 - 81 (56 - 68)      3σ - 4σ
Class C Abnormalities    54 - 67 (43 - 55)      2σ - 3σ
Class D Abnormalities    41 - 53 (32 - 42)      1σ - 2σ


Zone     Occupational Group

A278A    Pest Management
A830J    Hospital - Hyperbarics
A830Q    Hospital - Hematology/Oncology
A878A    Golf Course
A894A    Golf Course (Twin Base)
B18B     WL - Experimental Research
B18C     WL - Experimental Research
B33A1    Accel Eff
B36D     Heat Distribution
B433A    Navy Toxicology
B4D      WL - Electro-Optics Warfare
B490A    WL - Experimental Support
B5G1     ASC - Production Control
B6000    Fire Department (Page Manor)
B620C    WL - Systems Integration
B620N    WL - Electronic Warfare
B640B    AFIT/ENP Physics
B652B    WL - Materials & Surf Interaction
B654B    WL - Polymer Branch
B655C    WL - Nondestructive Eval
B743A    DRMO
B76A1    Fire Departments #3 & #6
B79C     AL - Hazard Assessment
B79E     AL - Hazard Assessment
B824B    Machine Shop
B838A    Occ. Env. Vet Medicine
C13R     Aircraft Structural Maintenance
C163A    Fire Departments #1, #2, & #5
C206E    Aircraft Modification
C70A     AFOSI Tech Svcs
C71A1    Packing and Crating
C89B     Environmental Management
C91B1    Fuel Systems

Table 4-19. Comparison of ALT, AST, GGT criterion and ALT criterion (WL =
Wright Labs, AL = Armstrong Labs).


Let $c_A$, $c_B$, $c_C$, and $c_D$ represent the number of Class A, Class B, Class C,
and Class D abnormalities, respectively, in a particular zone. We assumed each class of
defects was independent, and the occurrences in each class were well modeled by a
Poisson distribution. Then we defined the number of demerits in that zone as

$$D = 100c_A + 50c_B + 10c_C + c_D$$

The demerit weights of Class A - 100, Class B - 50, Class C - 10, and Class D - 1 are
used fairly widely in practice (Montgomery, 1991; 186).
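The classification and weighting scheme can be illustrated with a short sketch. It assumes the class lower bounds listed in the scheme above (male and female) and the 100/50/10/1 weights; the function names `alt_class` and `zone_demerits` and the readings are hypothetical and do not come from the study data.

```python
from collections import Counter

# Lower bound of each class, most severe first (male "M", female "F").
CLASS_BOUNDS = {
    "M": [("A", 82), ("B", 68), ("C", 54), ("D", 41)],
    "F": [("A", 69), ("B", 56), ("C", 43), ("D", 32)],
}
WEIGHTS = {"A": 100, "B": 50, "C": 10, "D": 1}

def alt_class(alt, sex):
    """Return the abnormality class for an ALT reading, or None if normal."""
    for label, lower in CLASS_BOUNDS[sex]:
        if alt >= lower:
            return label
    return None

def zone_demerits(readings):
    """Total demerits D for a list of (alt, sex) readings in one zone."""
    counts = Counter(alt_class(alt, sex) for alt, sex in readings)
    return sum(WEIGHTS[c] * counts[c] for c in WEIGHTS)

# Hypothetical zone: one normal reading, two Class D, one Class B, one Class A.
zone = [(30, "M"), (45, "M"), (60, "F"), (90, "M"), (33, "F")]
print(zone_demerits(zone))  # 152
```

Checking the classes from most to least severe means each reading receives exactly one class, so no abnormal result is counted twice.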

Suppose a zone had $n$ individuals in it. Then the number of demerits per individual,
$u$, is:

$$u = \frac{D}{n}$$

where $D$ is the total number of demerits in the entire zone. Since $u$ is a linear
combination of independent Poisson random variables, Montgomery suggests plotting the
statistics $u$ on control charts with the following parameters:

$$\mathrm{UCL} = \bar{u} + 3\hat{\sigma}_u$$

$$\text{Center line} = \bar{u}$$

where

$$\bar{u} = 100\bar{u}_A + 50\bar{u}_B + 10\bar{u}_C + \bar{u}_D$$

and

$$\hat{\sigma}_u = \left[\frac{(100)^2\bar{u}_A + (50)^2\bar{u}_B + (10)^2\bar{u}_C + \bar{u}_D}{n}\right]^{1/2}$$

Here $\bar{u}_A$, $\bar{u}_B$, $\bar{u}_C$, and $\bar{u}_D$ represent the average number
of Class A, Class B, Class C, and Class D abnormalities per individual. From these
calculations, control charts were made for each year. The 1991 Demerit Control Chart is
in Figure 4-6 and a summary chart is in Table 4-20.
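The demerit chart limits described above can be computed as in the sketch below. Two points are assumptions rather than statements from the study: the class averages are pooled over all zones in the year (total class count divided by total individuals), and each zone's own size n is used in its standard deviation. The zone labels, counts, and the function name `demerit_chart` are hypothetical.

```python
import math

WEIGHTS = {"A": 100, "B": 50, "C": 10, "D": 1}

def demerit_chart(zones):
    """Return (u_bar, {zone: (u, ucl)}) for one year's demerit chart.

    zones maps a zone label to (n, counts), where counts holds the number
    of Class A-D abnormalities observed among the n individuals.
    """
    total_n = sum(n for n, _ in zones.values())
    # Average number of class-k abnormalities per individual, pooled over zones.
    u_bar_k = {k: sum(c.get(k, 0) for _, c in zones.values()) / total_n
               for k in WEIGHTS}
    u_bar = sum(WEIGHTS[k] * u_bar_k[k] for k in WEIGHTS)
    unit_variance = sum(WEIGHTS[k] ** 2 * u_bar_k[k] for k in WEIGHTS)
    chart = {}
    for zone, (n, counts) in zones.items():
        u = sum(WEIGHTS[k] * counts.get(k, 0) for k in WEIGHTS) / n
        sigma_u = math.sqrt(unit_variance / n)
        chart[zone] = (u, u_bar + 3 * sigma_u)
    return u_bar, chart

# Hypothetical zones (illustrative only).
zones = {"Z1": (50, {"D": 2}), "Z2": (50, {"D": 1}), "Z3": (4, {"A": 1})}
u_bar, chart = demerit_chart(zones)
flagged = [zone for zone, (u, ucl) in chart.items() if u > ucl]
print(flagged)  # ['Z3']
```

Because the weights enter the variance squared, a single Class A case in a small zone moves the plotted statistic much further than several Class D cases in a large one, which is the behavior the demerit system is meant to capture.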


Figure  4-6.  1991  demerit  control  chart. 


Table 4-20. Summary of demerit control charts.


Summary

This study answered the questions initially posed by the 74th SGPO. It first
looked into the distributions of the tests they use for monitoring occupational liver
disease. All six tests, either in their original form or through a log transformation, can
be approximated using the normal distribution. We also examined the upper control limits
they established and found some inconsistencies where the limits were set based on
percentiles of the population. Through multivariate analysis, we were able to eliminate
some of the tests used in the past for screening. We found that ALT, AST, and GGT
(Transferase Index) are sufficient in examining liver disease for their application and BR,
AP, and albumin need not be used. Support was also found for the 74th's decision to use
just ALT as the primary liver function screening test. Based on three different criteria,
including a demerit system to weight severity, we identified five work zones on WPAFB
where liver disease appears to have been a severe problem over the past six years
(summary in Table 4-21). With this knowledge in hand, the 74th SGPO is equipped to
concentrate efforts on the diminution or possible elimination of occupational liver disease
at WPAFB.


Zone     Occupational Group                Transferase   ALT         Demerit
                                           Criterion     Criterion   Score
                                           Average       Average

C163A    Fire Departments #1, #2, & #5     5.47          5.03
B6000    Fire Department (Page Manor)      5.43          5.5
C206E    Aircraft Modification             4.7           4.88
B654B    WL - Polymer Branch               3.18
B76A1    Fire Departments #3 & #6          4.06          4.28        275
B433A    Navy Toxicology                   4.07          3.3         264
A830Q    Hospital - Hematology/Oncology    4.13          4.13        201
A894A    Golf Course (Twin Base)           4.36          4.36        162
C91B1    Fuel Systems                      3.59          3.59        121
B490A    WL - Experimental Support         4.5           4.5         112
C89B     Environmental Management          4.22                      64
B79E     AL - Hazard Assessment            4.36
B79C     AL - Hazard Assessment            3.59          3.59        11

Table 4-21. Summary of liver disease “hot-spots.” Zones identified by all three
criteria (listed by demerit scores).
V. FINAL REMARKS AND FOLLOW-ON WORK


Final Remarks

The analysis conducted in this study was designed to be a screening tool for the
74th SGPO and to help them identify zones with abnormal occurrences of liver disease.
Based on the data and information available, we developed a method to help detect
abnormal zones, enabling them to concentrate efforts on the removal of occupational
toxins causing liver disease. However, the study was not without areas of concern and
possible improvement.

First, it was extremely difficult to work with the PHOENIX database and to produce
meaningful results from it. At the time of this study, a new system was in the process of
coming on-line. Hopefully, it will provide better access to the information collected and
a more reasonable means for future investigations. Secondly, the data itself is poorly
entered and managed. There were numerous errors and inconsistencies. The database,
whether with PHOENIX or some new system, must be properly maintained. The data must be
entered accurately and consistently in order to obtain meaningful results from future
studies.

Another area of concern was the current practices of the 74th SGPO, namely the
established normals and the use of just ALT. The population used in this study does not
correspond to consistent cut-offs for determining the normality of an individual's test
result when compared to the established normals. Considering a change in the established
normals may be appropriate unless proper justification exists for the current ones. In
regard to primarily monitoring ALT since 1994, the multivariate analysis did show it is
the single most important liver function test. However, the use of the Transferase Index
(or at least the tests used as inputs to the index) may provide more accurate assessments
of the prevalence of liver disease.

Regardless of the specific values or criteria used to classify an individual as
normal or abnormal, SPC, namely control charts, provides a ready means for monitoring
worker health. The control charts present information in a meaningful and
easy-to-understand manner that requires minimal understanding of statistics from the
medical practitioner. Further, the use of a demerit system better captures the severity of
abnormality, an important issue in medical surveillance.

Follow-On  Work 

While this research fully accomplished the set objectives, there exist areas for
possible future research.

Criterion. A study could be done to find the most accurate criterion for identifying
liver disease. Possible criteria include those used in this study, the Transferase Index,
and a combination of other liver function tests.

Demerit System. An exploration into a demerit system to find optimal
classifications and weights may provide better results. A study of this nature could be
applied to the results from this research or many other areas of interest.

Body Systems. While this research was done in conjunction with a similar study
on pulmonary functions, other body systems could be studied. Comparing results may add
further insight into occupational exposures.

Common Exposures. This research only identified abnormal zones and did not
investigate the actual causes of liver disease. Identifying common exposures is a logical
next step in preventing workplace exposures.

Composite Health Index. This study only examined liver functions. Ideally, an
overall composite health index to measure worker health can be developed. Such an
index would greatly assist medical surveillance efforts.
APPENDIX A

SCREENING ENZYME TESTS
Adopted from Fischbach, 1992


ALT (SGPT)

This test of enzyme levels is done primarily to diagnose liver disease. High
concentrations of the enzyme occur in the liver, and relatively low concentrations are
found in the heart, muscle, and kidney. These enzymes are also used to monitor the course
of treatment for hepatitis, active postnecrotic cirrhosis, or the effects of drug treatment
that might be toxic to the liver. This test is also used to differentiate between hemolytic
jaundice and jaundice due to liver disease. In comparison to AST, the ALT test is more
specific for liver malfunction.

AST  (SGOT) 

AST is an enzyme present in tissues of high metabolic activity. It occurs in
decreasing concentration in the heart, liver, skeletal muscle, kidney, brain, pancreas,
spleen, and lungs. The enzyme is released into the circulation following the injury or death
of cells. Any disease that causes change in these highly metabolic tissues will result in a
rise in AST. The amount of AST in the blood is directly related to the number of
damaged cells and the amount of time that passes between injury to the tissue and the test.
In liver disease, the level may be 10 to 100 times the normal. Also, liver disease
occasionally may cause a decrease instead of the expected increase.
GGT


The enzyme γ-glutamyl transferase is present mainly in the liver, kidney, prostate,
and spleen. The liver is considered the source of normal serum activity, despite the fact
that the kidney has the highest level of the enzyme. This enzyme is believed to function in
the transport of amino acids and peptides into cells across the cell membranes and to be
involved in glutathione metabolism. Men will have higher normal levels because of the
large amounts found in the prostate. This test is used to determine liver cell dysfunction
and to detect alcohol-induced liver disease. It is also an efficient way to screen for
consequences of chronic alcoholism. The GGT is very sensitive to the amount of alcohol
consumed by chronic drinkers. It can be used to monitor the cessation or reduction of
alcohol consumption. GGT activity is elevated in all forms of liver disease.

Bilirubin 

Bilirubin, resulting from the breakdown of hemoglobin in the red blood cells, is a
by-product of hemolysis (red blood cell destruction). A rise in serum levels will occur if
there is an excessive destruction of red blood cells or if the liver is unable to excrete the
normal amounts of bilirubin produced. A normal level of total bilirubin rules out any
significant impairment of the excretory function of the liver or excessive hemolysis of red
blood cells.
Albumin


Proteins and nucleic acids, the structural components of a cell, serve as biocatalysts
(enzymes), regulators of metabolism (hormones), and preservers of genetic makeup
(chromosomes). Amino acids are the building blocks of proteins. Albumin is a protein
that is formed in the liver and that helps to maintain normal distribution of water in the
body (colloidal osmotic pressure). It also helps in the transport of blood constituents such
as ions, pigments, bilirubin, hormones, fatty acids, enzymes, and certain drugs.
Approximately 53% to 60% of total protein is albumin. Decreased albumin levels are
caused by many different conditions. Increased albumin levels are generally not observed.

AP 

Alkaline phosphatase is an enzyme originating mainly in the bone, liver, and
placenta, with some activity in the kidney and intestines. It is called alkaline because it
functions best at a pH of 9. This enzyme test is used as a tumor marker and an index of
liver and bone disease, when correlated with other clinical findings. In liver disease, the
blood level rises when excretion of this enzyme is impaired as a result of obstruction in the
biliary tract.
APPENDIX B
SAS PROGRAMS

The essential components of the three SAS programs used to develop a SAS
compatible database are included in Appendix B. They are CONVERT.SAS, which
converts the ASCII files to SAS compatible files; SGPTRAW.SAS, which eliminates
multiple SSANs (similar programs were developed for each liver function test variable);
and MERGEALL.SAS, which merges all *.RAW files into the HEALTH.WPAFB2
database we developed. A number of other programs were developed for data
exploration, zone classification, factor analysis (through the use of PROC FACTOR), and
extracting counts of abnormalities based on varying criteria for each of the zones.
Additional program templates can be found in the thesis completed by Cpt Paul McAree,
GOR-96M.
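The step of eliminating multiple SSANs amounts to reshaping one-row-per-test data into one row per person. The sketch below illustrates that idea only; it is written in Python rather than SAS, it is not the author's program, and the function name `collapse_by_id`, the identifiers, and the readings are hypothetical.

```python
from collections import defaultdict

def collapse_by_id(rows):
    """Group (person_id, value) rows into one list of values per person_id,
    keeping the values in the order they were read."""
    wide = defaultdict(list)
    for person_id, value in rows:
        wide[person_id].append(value)
    return dict(wide)

# Hypothetical identifiers and readings (illustrative only).
long_rows = [("P001", 31), ("P001", 44), ("P002", 27), ("P001", 38)]
print(collapse_by_id(long_rows))  # {'P001': [31, 44, 38], 'P002': [27]}
```

Once each person occupies a single row, the separate test files can be merged on the identifier, which is the role MERGEALL.SAS plays in the appendix.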


APPENDIX  B.l 
CONVERT.SAS 


libname health 'user2';

/* Similar sections were done for each of
   the seven extracted files. Those
   essential to the liver data are included
   here. */

data health.chem;
   infile 'bryan1.';
   input first $ 1 ssan $ 1-9 sgpt 11-14
         sgot 16-19 yr 23-24 mo 26-27 dy 29-30
         ap 32-35 ggt 37-40 bili 42-45
         albumin 47-50;
   if index('0123456789',first)>0;
   if ap = 4303 then delete;
   if sgpt = . & sgot = . & ap = . & ggt = . &
      bili = . & albumin = . then delete;
   format chemdate yymmdd6.;
   chemdate = mdy(mo,dy,yr);
   drop mo dy yr;
run;

data health.blood;
   infile 'thesis4.';
   input first $ 1 ssan $ 1-9 wbc $ 11-14
         yr 20-21 mo 23-24 dy 26-27
         hemcrit 29-33;
   if index('0123456789',first)>0;
   format blddate yymmdd6.;
   blddate = mdy(mo,dy,yr);
   drop mo dy yr;
run;

data health.zone;
   infile 'thesis5.';
   input first $ 1 ssan $ 1-9 zone $ 11-18
         syr 22-23 smo 25-26 sdy 28-29
         eyr 33-34 emo 36-37 edy 39-40;
   if index('0123456789',first)>0;
   format stdate yymmdd6. enddate yymmdd6.;
   stdate = mdy(smo,sdy,syr);
   enddate = mdy(emo,edy,eyr);
   drop smo sdy syr emo edy eyr;
run;
APPENDIX B.2
SGPTRAW.SAS

libname health 'userS';
run;

options ls = 75;
proc sort data = health.chem;
   by ssan;
run;

data _null_;
   set health.chem;
   by ssan;
   file print notitles;
   if first.ssan then do;
      put @1 ssan @11 sgpt @;
      n = 15;
   end;
   if first.ssan = 0 and last.ssan = 0 then do;
      put @n sgpt @;
      n = n+4;
      retain n;
   end;
   if last.ssan then do;
      put @n sgpt @75 first;
   end;
run;
APPENDIX B.3
MERGEALL.SAS


/* Similar components were done for all
   *.RAW files (including those for the
   pulmonary research; data sets a through
   h). Only those applicable to this effort
   are included here. */

data i;
   infile 'sgpt.raw';
   input ssan $ 1-9 sgpt1 11-14 sgpt2
         15-18 sgpt3 19-22 sgpt4 23-26
         sgpt5 27-30 sgpt6 31-34 sgpt7
         35-38 sgpt8 39-42;
run;

proc sort data = i; by ssan; run;

data j;
   infile 'sgot.raw';
   input ssan $ 1-9 sgot1 11-14 sgot2
         15-18 sgot3 19-22 sgot4 23-26
         sgot5 27-30 sgot6 31-34 sgot7
         35-38 sgot8 39-42;
run;

proc sort data = j; by ssan; run;
data k;
   infile 'ap.raw';
   input ssan $ 1-9 ap1 11-14 ap2
         15-18 ap3 19-22 ap4 23-26
         ap5 27-30 ap6 31-34 ap7 35-38
         ap8 39-42;
run;

proc sort data = k; by ssan; run;

data l;
   infile 'ggt.raw';
   input ssan $ 1-9 ggt1 11-14 ggt2
         15-18 ggt3 19-22 ggt4 23-26
         ggt5 27-30 ggt6 31-34 ggt7
         35-38 ggt8 39-42;
run;

proc sort data = l; by ssan; run;
data m;
   infile 'bili.raw';
   input ssan $ 1-9 bili1 11-15 bili2 16-20
         bili3 21-25 bili4 26-30 bili5
         31-35 bili6 36-40 bili7 41-45 bili8
         46-51;
run;

proc sort data = m; by ssan; run;

data n;
   infile 'albumin.raw';
   input ssan $ 1-9 albumin1 11-15
         albumin2 16-20 albumin3 21-25
         albumin4 26-30 albumin5 31-35
         albumin6 36-40 albumin7 41-45
         albumin8 46-50;
run;

proc sort data = n; by ssan; run;
data o;
   infile 'chemdate.raw';
   input ssan $ 1-9 yr1 11-12 mo1 13-14
         dy1 15-16 yr2 20-21 mo2 22-23
         dy2 24-25 yr3 29-30 mo3 31-32
         dy3 33-34 yr4 38-39 mo4 40-41
         dy4 42-43 yr5 47-48 mo5 49-50
         dy5 51-52 yr6 56-57 mo6 58-59
         dy6 60-61 yr7 65-66 mo7 67-68
         dy7 69-70 yr8 74-75 mo8 76-77
         dy8 78-79;
   format cdt1 cdt2 cdt3 cdt4 cdt5 cdt6
          cdt7 cdt8 yymmdd6.;
   cdt1 = mdy(mo1,dy1,yr1);
   cdt2 = mdy(mo2,dy2,yr2);
   cdt3 = mdy(mo3,dy3,yr3);
   cdt4 = mdy(mo4,dy4,yr4);
   cdt5 = mdy(mo5,dy5,yr5);
   cdt6 = mdy(mo6,dy6,yr6);
   cdt7 = mdy(mo7,dy7,yr7);
   cdt8 = mdy(mo8,dy8,yr8);
   drop yr1 mo1 dy1 yr2 mo2 dy2 yr3
        mo3 dy3 yr4 mo4 dy4 yr5 mo5 dy5
        yr6 mo6 dy6 yr7 mo7 dy7 yr8 mo8
        dy8;
run;

proc sort data = o; by ssan; run;

/* Data sets p, q, and r were for data not
   used in the analysis */

data s;
   infile 'zone.raw';
   input ssan $ 1-9 zone1 $ 11-19 zone2
         $ 20-28 zone3 $ 29-37 zone4 $ 38-46
         zone5 $ 47-55 zone6 $ 56-64
         zone7 $ 65-73;
run;

proc sort data = s; by ssan; run;
data t;
   infile 'stdate.raw';
   input ssan $ 1-9 yr1 11-12 mo1 13-14
         dy1 15-16 yr2 20-21 mo2 22-23
         dy2 24-25 yr3 29-30 mo3 31-32
         dy3 33-34 yr4 38-39 mo4 40-41
         dy4 42-43 yr5 47-48 mo5 49-50
         dy5 51-52 yr6 56-57 mo6 58-59
         dy6 60-61 yr7 65-66 mo7 67-68
         dy7 69-70;
   format sdt1 sdt2 sdt3 sdt4 sdt5 sdt6
          sdt7 yymmdd6.;
   sdt1 = mdy(mo1,dy1,yr1);
   sdt2 = mdy(mo2,dy2,yr2);
   sdt3 = mdy(mo3,dy3,yr3);
   sdt4 = mdy(mo4,dy4,yr4);
   sdt5 = mdy(mo5,dy5,yr5);
   sdt6 = mdy(mo6,dy6,yr6);
   sdt7 = mdy(mo7,dy7,yr7);
   drop yr1 mo1 dy1 yr2 mo2 dy2 yr3
        mo3 dy3 yr4 mo4 dy4 yr5 mo5 dy5
        yr6 mo6 dy6 yr7 mo7 dy7;
run;

proc sort data = t; by ssan; run;
data u;
   infile 'enddate.raw';
   input ssan $ 1-9 yr1 11-12 mo1 13-14
         dy1 15-16 yr2 20-21 mo2 22-23
         dy2 24-25 yr3 29-30 mo3 31-32
         dy3 33-34 yr4 38-39 mo4 40-41
         dy4 42-43 yr5 47-48 mo5 49-50
         dy5 51-52 yr6 56-57 mo6 58-59
         dy6 60-61 yr7 65-66 mo7 67-68
         dy7 69-70;
   format edt1 edt2 edt3 edt4 edt5 edt6
          edt7 yymmdd6.;
   edt1 = mdy(mo1,dy1,yr1);
   edt2 = mdy(mo2,dy2,yr2);
   edt3 = mdy(mo3,dy3,yr3);
   edt4 = mdy(mo4,dy4,yr4);
   edt5 = mdy(mo5,dy5,yr5);
   edt6 = mdy(mo6,dy6,yr6);
   edt7 = mdy(mo7,dy7,yr7);
   drop yr1 mo1 dy1 yr2 mo2 dy2 yr3
        mo3 dy3 yr4 mo4 dy4 yr5 mo5 dy5
        yr6 mo6 dy6 yr7 mo7 dy7;
run;

proc sort data = u; by ssan; run;

/* Data sets v and w were for data not
   used in the analysis */

libname health 'userS';

data health.wpafb2;
   merge a b (in=in1) c d e f g h i
         (in=in2) j k l m n o p q (in=in3) r s
         t u v w;
   by ssan;
   if in1 or in2 or in3;
   if zone1 = ' ' then delete;
run;

proc contents;
run;